Studying the Advantages of a Messy Evolutionary Algorithm for Natural Language Tagging
نویسنده
چکیده
The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc) is called tagging and is a key step in parsing and many other language processing and generation applications. Automatic lexical taggers are usually based on statistical methods, such as Hidden Markov Models, which works with information extracted from large tagged available corpora. This information consists of the frequencies of the contexts of the words, that is, of the sequence of their neighbouring tags. Thus, these methods rely on the assumption that the tag of a word only depends on its surrounding tags. This work proposes the use of a Messy Evolutionary Algorithm to investigate the validity of this assumption. This algorithm is an extension of the fast messy genetic algorithms, a variety of Genetic Algorithms that improve the survival of high quality partial solutions or building blocks. Messy GAs do not require all genes to be present in the chromosomes and they may also appear more than one time. This allows us to study the kind of building blocks that arise, thus obtaining information of possible relationships between the tag of a word and other tags corresponding to any position in the sentence. The paper describes the design of a messy evolutionary algorithm for the tagging problem and a number of experiments on the performance of the system and the parameters of the algorithm.
منابع مشابه
A MODEL FOR EVOLUTIONARY DYNAMICS OF WORDS IN A LANGUAGE
Human language, over its evolutionary history, has emerged as one of the fundamental defining characteristic of the modern man. However, this milestone evolutionary process through natural selection has not left any ’linguistic fossils’ that may enable us to trace back the actual course of development of language and its establishment in human societies. Lacking analytical tools to fathom the cr...
متن کاملAn improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
متن کاملسیستم برچسب گذاری اجزای واژگانی کلام در زبان فارسی
Abstract: Part-Of-Speech (POS) tagging is essential work for many models and methods in other areas in natural language processing such as machine translation, spell checker, text-to-speech, automatic speech recognition, etc. So far, high accurate POS taggers have been created in many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...
متن کاملNatural Language Tagging with Parallel Genetic Algorithms
This work analyzes the relative advantages of different metaheuristic approaches to the well known natural language processing problem of part-of-speech tagging. This consists of assigning to each word of a text its disambiguated part-of-speech according to the context in which the word is used. We have applied a classic genetic algorithm (GA), a CHC algorithm, and a Simulated Annealing (SA). D...
متن کاملNatural language tagging with genetic algorithms
This work analyzes the relative advantages of different metaheuristic approaches to the well-known natural language processing problem of part-of-speech tagging. This consists of assigning to each word of a text its disambiguated part-of-speech according to the context in which the word is used. We have applied a classic genetic algorithm (GA), a CHC algorithm, and a simulated annealing (SA). D...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003